Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a tracing subscriber for LSP logging #93

Merged
merged 15 commits into from
Dec 13, 2024

Conversation

DavisVaughan
Copy link
Collaborator

@DavisVaughan DavisVaughan commented Dec 11, 2024

Branched from #80

The LSP now activates a global tracing "subscriber" for all tracing::error!() (and friends) events. This means that we can use the free functions tracing provides to log from anywhere in the LSP (or from the parser/formatter internals) and it will get captured by this global subscriber. (Notably, I have not activated the compatibility helper for log::, and have removed that crate).

The global subscriber implements a write() method that forwards the message in a non-async way to a new logging specific thread. That logging thread can then call client.log_message().await to perform an async send of the message to the client. That's the only job of that thread.

The AuxiliaryEvent thread has gotten a bit simpler since we don't worry about logging in there now.


Checking for air.logLevel and air.dependencyLogLevels is not yet done. That is going to be 99% a frontend PR with lots of changes required in the typescript extension, and this PR felt big enough as is. So I'll wait to do that. But here is the general idea:

  • User sets air.logLevel to error/info/warn/debug/trace in their user level settings.json (note, ignoring workspace level)
  • Forward that value to the server using the initializationOptions field of type LSPAny within the InitializeParams. The client and server will need to have matching representations of what initializationOptions looks like, but I'm imagining a Settings struct where one field is logLevel.
  • Server gets that value in the initialize() handler and uses it to finalize the logLevel and fully set up logging

Note that this approach would work for other IDEs besides VS Code. The equivalent to the air vs code extension would just have to make sure to pass a structured initializationOptions field through to the server.

https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#initialize

// TODO: Get user specified log level from `params.initialization_options`
let log_level = None;

logging::init_logging(log_tx, log_level, params.client_info.as_ref());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initializing logging in the initialize() handler is as early as I think we can initialize it. We are going to require information from the Client to correctly set up our logging infrastructure, so we can't do this any earlier.

Comment on lines +69 to +74
/// We use `MessageType::LOG` to prevent the middleware from adding its own
/// timestamp and log level labels. We add that ourselves through tracing.
pub(crate) async fn start(mut self) {
while let Some(message) = self.log_rx.recv().await {
self.client
.log_message(MessageType::LOG, message.contents)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always forward log messages of type LOG to the middleware.

Sending LOG messages means the middleware calls the raw outputChannel.appendLine() here:
https://github.com/microsoft/vscode-languageserver-node/blob/906f5fb306e1f6059cbdcb1efd962647222b5867/client/src/common/client.ts#L1281-L1297

I prefer that over going through ERROR or WARN in the middleware, which goes through this logOutputMessage() helper:
https://github.com/microsoft/vscode-languageserver-node/blob/906f5fb306e1f6059cbdcb1efd962647222b5867/client/src/common/client.ts#L1179-L1187

That helper unconditionally prepends the log level and a timestamp to the message, but we already do this.

It will also show a notification as a toast message to the user if showNotification is true or "force", but it's always false when processing server log messages. This means that literally the only thing we would get from the middleware by sending MessageType::ERROR is for it to automatically prepend [Error - <time>] to our messages, so it seems fine to skip this.

Comment on lines 138 to 144
let writer = if client_info.is_some_and(|client_info| {
client_info.name.starts_with("Zed") || client_info.name.starts_with("Visual Studio Code")
}) {
BoxMakeWriter::new(LogWriterMaker::new(log_tx))
} else {
BoxMakeWriter::new(std::io::stderr)
};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing what ruff does here. It seems like VS Code and Zed handle window/logMessage well, but perhaps not every lsp client does? So they specially approve ones that handle it well and send to stderr otherwise.

Comment on lines +159 to +160
// Display local time rather than UTC
.with_timer(LocalTime::rfc_3339())
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely actually displays logs in your local time!

Comment on lines 168 to 185
if !is_test_client(client_info) {
tracing::subscriber::set_global_default(subscriber)
.expect("Should be able to set the global subscriber.");
}

tracing::info!("Logging initialized with level: {log_level}");
}

/// We never log during tests as tests run in parallel within a single process,
/// but you can only have 1 global subscriber per process.
///
/// If you are debugging a single test, you can override this to emit messages to stderr.
///
/// Note that if you override this and run multiple tests in parallel, then the call
/// to `set_global_default()` will error causing a panic.
fn is_test_client(client_info: Option<&ClientInfo>) -> bool {
client_info.map_or(false, |client_info| client_info.name == "AirTestClient")
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important notes here about skipping logging during tests. They are very noisy in the test output, and there are issues with running tests in parallel since the "global subscriber" is by nature a "once per process" kind of thing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems strange to me. It feels like it would be better to control this with the log level. By default we don't see this output since it's captured anyway.

Tests are run in parallel that's true, but I don't think it's customary to disable logging in tests altogether? It seems like we're making it harder to debug tests here.

Comment on lines 251 to 259
// This is a potential reason to use `air_` as the crate prefix,
// it would make it easy to set the `tracing_level()` for only air related crates
let filter = if meta.target().starts_with("air") || meta.target().starts_with("lsp") {
self.filter.tracing_level()
} else {
tracing::Level::INFO
};

meta.level() <= &filter
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that tons of other crates use tracing::info!() and friends internally

This filter uses target, which is typically the module name, to determine the tracing level to filter with, and we never go below Info for 3rd party crates.

This is actually one reason it would be quite nice to prefix all of our crates with air_! Detection would be very simple.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the filter implemented in ark is more flexible? https://github.com/posit-dev/ark/blob/3b8ff6112d14f1b78640e4db55c82754c0b362bf/crates/ark/src/logger.rs#L26-L43

Is there any reason not to use this approach?

Comment on lines 271 to 272
set.shutdown().await;

log::trace!("Main loop exited.");
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that after the log loop is shutdown, we won't be able to see anymore messages. I've removed a few messages that would otherwise occur too soon or too late because of this to prevent confusion.

Comment on lines 329 to +332
LspRequest::Initialize(params) => {
respond(tx, handlers_state::initialize(params, &mut self.lsp_state, &mut self.world), LspResponse::Initialize)?;
// Unwrap: `Initialize` method should only be called once.
let log_tx = self.log_tx.take().unwrap();
respond(tx, handlers_state::initialize(params, &mut self.lsp_state, &mut self.world, log_tx), LspResponse::Initialize)?;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quite like how lsp_server works here. Instead of Initialize being an event you handle in your main loop, it is instead handled through explicit calls to initialize_start() to receive the Initialize request, and initialize_finish() to send back an InitializeResult.

I like that because it really drives home that they are once-per-session methods, not ones that can be recalled through the main loop.

https://docs.rs/lsp-server/latest/lsp_server/struct.Connection.html#method.initialize_start

Comment on lines -191 to -197
log::trace!("Starting LSP");

let (service, socket) = new_lsp();
let server = tower_lsp::Server::new(read, write, socket);
server.serve(service).await;

log::trace!("LSP exiting gracefully.",);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again these messages are "too soon" or "too late" from the perspective of the log thread

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be eprintln!() then

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like it would be quite odd to have two messages get sent to stderr when all other messages end up over window/logMessage. I feel like it just adds one more layer of places to look for logs. Would it be ok to try and drop these for now and see if we need them later on?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure that's fine with me.

Comment on lines +95 to +105
// Regardless of how we got the params, ensure the client name is set to
// `AirTestClient` so we can recognize it when we set up global logging.
fn with_client_info(
mut init_params: lsp_types::InitializeParams,
) -> lsp_types::InitializeParams {
init_params.client_info = Some(ClientInfo {
name: String::from("AirTestClient"),
version: None,
});
init_params
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like using AirTestClient as the way to determine "are we testing or not" rather than cfg(debug_assertions) because this will also "look like" testing mode in integration tests!

Comment on lines 168 to 185
if !is_test_client(client_info) {
tracing::subscriber::set_global_default(subscriber)
.expect("Should be able to set the global subscriber.");
}

tracing::info!("Logging initialized with level: {log_level}");
}

/// We never log during tests as tests run in parallel within a single process,
/// but you can only have 1 global subscriber per process.
///
/// If you are debugging a single test, you can override this to emit messages to stderr.
///
/// Note that if you override this and run multiple tests in parallel, then the call
/// to `set_global_default()` will error causing a panic.
fn is_test_client(client_info: Option<&ClientInfo>) -> bool {
client_info.map_or(false, |client_info| client_info.name == "AirTestClient")
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems strange to me. It feels like it would be better to control this with the log level. By default we don't see this output since it's captured anyway.

Tests are run in parallel that's true, but I don't think it's customary to disable logging in tests altogether? It seems like we're making it harder to debug tests here.

Comment on lines 251 to 259
// This is a potential reason to use `air_` as the crate prefix,
// it would make it easy to set the `tracing_level()` for only air related crates
let filter = if meta.target().starts_with("air") || meta.target().starts_with("lsp") {
self.filter.tracing_level()
} else {
tracing::Level::INFO
};

meta.level() <= &filter
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the filter implemented in ark is more flexible? https://github.com/posit-dev/ark/blob/3b8ff6112d14f1b78640e4db55c82754c0b362bf/crates/ark/src/logger.rs#L26-L43

Is there any reason not to use this approach?

crates/lsp/src/tower_lsp.rs Outdated Show resolved Hide resolved
Comment on lines -191 to -197
log::trace!("Starting LSP");

let (service, socket) = new_lsp();
let server = tower_lsp::Server::new(read, write, socket);
server.serve(service).await;

log::trace!("LSP exiting gracefully.",);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be eprintln!() then

- `just test` is silent
- `just test-verbose` is noisy and sequential
- CI is noisy and sequential and `trace` level
Copy link
Collaborator Author

@DavisVaughan DavisVaughan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay after some reworking and talking to @lionel- here is where we landed:

  • cargo test is silent, no logs. Aliased to just test.

  • cargo test -- --nocapture shows logs, but you really want to run with --test-threads 1 so the logging is scoped to the test it applies to. Use just test-verbose for this.

  • AIR_LOG_LEVEL=trace allows you to set the log level of air crates. air.logLevel will be the VS Code option. The option will be an enum with autocomplete.

  • AIR_DEPENDENCY_LOG_LEVELS=tokio=info,tower_lsp=debug allows you to set the log level of 3rd party crates. air.dependencyLogLevels will be the VS Code option. The option will be free text.

  • Note that by default, we do Info logs for air crates and NO logging for 3rd party crates.

  • CI runs with cargo test -- --nocapture --test-threads 1 and AIR_LOG_LEVEL=trace. This should make it pretty helpful if something goes wrong on CI.

Comment on lines +7 to +37
//! The logging system for `air lsp`.
//!
//! ## Air crate logs
//!
//! For air crates, a single log level is supplied as one of: error, warn, info, debug,
//! or trace, which is applied to all air crates that log.
//!
//! Resolution strategy:
//!
//! - The environment variable `AIR_LOG_LEVEL` is consulted.
//!
//! - The LSP `InitializeParams.initializationOptions.logLevel` option is consulted. This
//! can be set in VS Code / Positron using `air.logLevel`, or in Zed by supplying
//! `initialization_options`.
//!
//! - If neither are supplied, we fallback to `"info"`.
//!
//! ## Dependency crate logs
//!
//! For dependency crates, either a single log level can be supplied, or comma separated
//! `target=level` pairs can be supplied, like `tower_lsp=debug,tokio=info`.
//!
//! Resolution strategy:
//!
//! - The environment variable `AIR_DEPENDENCY_LOG_LEVELS` is consulted.
//!
//! - The LSP `InitializeParams.initializationOptions.dependencyLogLevels` option is
//! consulted. This can be set in VS Code / Positron using `air.dependencyLogLevel`, or
//! in Zed by supplying `initialization_options`.
//!
//! - If neither are supplied, we fallback to no logging for dependency crates.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the newly updated resolution strategy

crates/lsp/src/logging.rs Outdated Show resolved Hide resolved
Comment on lines +77 to +91
impl std::fmt::Display for LspNotification {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.into())
}
}
impl std::fmt::Display for LspRequest {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.into())
}
}
impl std::fmt::Display for LspResponse {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.into())
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using strum::IntoStaticStr, we make the Display method show just the enum variant name

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool!

Comment on lines +119 to +133
impl std::fmt::Debug for TraceLspNotification<'_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self.inner {
LspNotification::DidOpenTextDocument(params) => {
// Ignore the document itself in trace logs
f.debug_tuple(self.inner.into())
.field(&params.text_document.uri)
.field(&params.text_document.version)
.field(&params.text_document.language_id)
.finish()
}
_ => std::fmt::Debug::fmt(self.inner, f),
}
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ended up being easier to write a wrapper type that overrides specific variants but otherwise pushes through to the "normal" derived Debug implementation. If we try and implement Debug on LspNotification directly then we lose everything it auto-derives for us, which is actually quite a lot of repetitive code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this problem after our call yesterday. That's a nice solution.

@DavisVaughan DavisVaughan requested a review from lionel- December 12, 2024 21:58
Comment on lines +62 to +63
// TODO:
// - Add `air.logLevel` and `air.dependencyLogLevels` as VS Code extension options that set
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also considering going the rust-analyzer route of having lots of smaller namespaces for options, like in this case:

air.log.level
air.log.dependencyLevels

which would come in as a LogSettings struct containing level and dependencyLevels fields.

That seems like it would be useful for grouping related options?

Copy link
Collaborator

@lionel- lionel- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

My only comment is that I think we should unconditionally log the LSP message types at info level.

Comment on lines +77 to +91
impl std::fmt::Display for LspNotification {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.into())
}
}
impl std::fmt::Display for LspRequest {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.into())
}
}
impl std::fmt::Display for LspResponse {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(self.into())
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool!

crates/lsp/src/tower_lsp.rs Show resolved Hide resolved
Comment on lines +119 to +133
impl std::fmt::Debug for TraceLspNotification<'_> {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self.inner {
LspNotification::DidOpenTextDocument(params) => {
// Ignore the document itself in trace logs
f.debug_tuple(self.inner.into())
.field(&params.text_document.uri)
.field(&params.text_document.version)
.field(&params.text_document.language_id)
.finish()
}
_ => std::fmt::Debug::fmt(self.inner, f),
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this problem after our call yesterday. That's a nice solution.

Comment on lines -191 to -197
log::trace!("Starting LSP");

let (service, socket) = new_lsp();
let server = tower_lsp::Server::new(read, write, socket);
server.serve(service).await;

log::trace!("LSP exiting gracefully.",);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure that's fine with me.

justfile Show resolved Hide resolved
/// Handle to main loop. Drop it to cancel the loop, all associated tasks,
/// and drop all owned state.
_main_loop: tokio::task::JoinSet<()>,
}

impl Backend {
async fn request(&self, request: LspRequest) -> anyhow::Result<LspResponse> {
self.log_info(format!("Incoming: {request:#?}"));
tracing::trace!("Incoming:\n{request:#?}", request = request.trace());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also discussed a tracing::info() message with the message type so we can better make sense of user logs before they enable trace logging?

client_info.map_or(false, |client_info| client_info.name == "AirTestClient")
}

// TODO: Is there a way to generate this at compile time?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chatgpt suggests

cargo metadata --no-deps --format-version 1 | jq -r '.packages[].name'

which gives me

air
air_r_formatter
air_r_syntax
air_formatter_test
line_ending
air_r_parser
air_r_factory
tests_macros
fs
lsp
lsp_test
biome_ungrammar
xtask_codegen
xtask

I guess we can add an exclude list for build-time things like biome_ungrammar. Or there's probably a way to determine dev deps from the metadata.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove the jq you get a bunch of json data which is fine to ingest from Rust. This could be done in build.rs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course there's a crate for this. cargo_metadata! 3f0f720

AIR_LOG_LEVEL: trace
# `--nocapture` to see our own `tracing` logs
# `--test-threads 1` to ensure `tracing` logs aren't interleaved
run: cargo test -- --nocapture --test-threads 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the just alias? I don't mind either way, I guess this is more explicit.

@DavisVaughan DavisVaughan merged commit a0bb387 into fix/aux-global Dec 13, 2024
4 checks passed
@DavisVaughan DavisVaughan deleted the feature/tracing-logs branch December 13, 2024 18:52
@DavisVaughan DavisVaughan restored the feature/tracing-logs branch December 13, 2024 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants